136 research outputs found

RNALOSS: a web server for RNA locally optimal secondary structures

Author: Clote P.
Publication venue: Oxford University Press
Publication date: 01/01/2005

RNAomics, analogous to proteomics, concerns aspects of the secondary and tertiary structure, folding pathway, kinetics, comparison, function and regulation of all RNA in a living organism. Given recently discovered roles played by micro RNA, small interfering RNA, riboswitches, ribozymes, etc., it is important to gain insight into the folding process of RNA sequences. We describe the web server RNALOSS, which provides information about the distribution of locally optimal secondary structures, which may form kinetic traps in the folding process. The tool RNALOSS may be useful in designing RNA sequences that not only have low folding energy, but whose distribution of locally optimal secondary structures would suggest rapid and robust folding. Website:

CiteSeerX

Crossref

PubMed Central

DiANNA: a web server for disulfide connectivity prediction

Author: Clote P.
Ferrè F.
Publication venue: Oxford University Press
Publication date: 01/01/2005

Correctly predicting the disulfide bond topology of a protein is crucial for understanding protein function and can greatly help tertiary structure prediction methods. Given a protein sequence as input, the web server outputs a disulfide connectivity prediction. The procedure is as follows. First, PSIPRED is run to predict the protein's secondary structure, then PSIBLAST is run against the non-redundant SwissProt to obtain a multiple alignment of the input sequence. The predicted secondary structure and the profile arising from this alignment are used in the training phase of our neural network. Next, cysteine oxidation state is predicted, then each pair of cysteines in the protein sequence is assigned a likelihood of forming a disulfide bond; this is performed by means of a novel architecture (diresidue neural network). Finally, Rothberg's implementation of Gabow's maximum weighted matching algorithm is applied to the diresidue neural network scores to produce the final connectivity prediction. Our novel neural network-based approach achieves results that are comparable to, and in some cases better than, those of current state-of-the-art methods.
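The final matching step can be illustrated with a small sketch. This is not Rothberg's implementation of Gabow's algorithm; it is an exhaustive search that is only practical for a handful of cysteines, and the function name `best_matching`, the cysteine positions, and the pair scores are all hypothetical.

```python
def best_matching(nodes, score):
    """Exhaustive maximum-weight matching over a small list of nodes.

    `score` maps frozenset({a, b}) to the weight of pairing a with b.
    Returns (total_weight, list_of_pairs). Exponential time: a stand-in
    for a polynomial-time matching algorithm, not a replacement for it.
    """
    if len(nodes) < 2:
        return 0.0, []
    first, rest = nodes[0], nodes[1:]
    # Option 1: leave `first` unmatched.
    best_w, best_m = best_matching(rest, score)
    # Option 2: pair `first` with each remaining node in turn.
    for k, other in enumerate(rest):
        w, m = best_matching(rest[:k] + rest[k + 1:], score)
        w += score[frozenset((first, other))]
        if w > best_w:
            best_w, best_m = w, [(first, other)] + m
    return best_w, best_m


# Hypothetical cysteine positions and pairwise bonding scores.
cysteines = [10, 25, 40, 60]
scores = {
    frozenset((10, 25)): 0.9, frozenset((40, 60)): 0.8,
    frozenset((10, 40)): 0.3, frozenset((25, 60)): 0.2,
    frozenset((10, 60)): 0.1, frozenset((25, 40)): 0.95,
}
weight, bonds = best_matching(cysteines, scores)
```

Note that the single highest-scoring pair (25, 40) is not part of the best overall connectivity here, which is why a matching step is used rather than greedily picking the top pair.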

CiteSeerX

PubMed Central

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

BTW: a web server for Boltzmann time warping of gene expression time series

Author: Clote P.
Ferrè F.
Publication venue: Oxford University Press
Publication date: 01/01/2006

Dynamic time warping (DTW) is a well-known quadratic-time algorithm to determine the smallest distance and optimal alignment between two numerical sequences, possibly of different lengths. Originally developed for speech recognition, this method has been used in data mining, medicine and bioinformatics. For gene expression time series data, time warping distance is arguably a more flexible tool to determine genes having similar temporal expression, hence possibly related biological function, than either Euclidean distance or correlation coefficient, especially since time warping accommodates sequences of different lengths. The BTW web server allows a user to upload two tab-separated text files A and B of gene expression data, each possibly having a different number of time intervals of different durations. BTW then computes the time warping distance between each gene of A and each gene of B, using a recently developed symmetric algorithm which additionally computes the Boltzmann partition function and outputs Boltzmann pair probabilities. The Boltzmann pair probabilities, not available with any other existing software, suggest possible biological significance of certain positions in an optimal time warping alignment. Availability:
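For orientation, the classical quadratic-time recurrence can be sketched as follows. This is the textbook DTW distance only, not the symmetric Boltzmann variant used by BTW; the function name `dtw_distance` and the absolute-difference local cost are assumptions made for illustration.

```python
def dtw_distance(a, b):
    """Classical dynamic time warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # d[i][j] = minimal cost of aligning the prefixes a[:i] and b[:j].
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])  # assumed local cost
            # Extend the cheapest of: step in a, step in b, step in both.
            d[i][j] = cost + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]


# Sequences of different lengths can still align perfectly.
same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shifted = dtw_distance([1, 2, 3], [2, 3, 4])
```

The table has (n+1)(m+1) cells and each is filled in constant time, which is where the quadratic running time comes from.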

CiteSeerX

Crossref

PubMed Central

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

DiANNA 1.1: an extension of the DiANNA web server for ternary cysteine classification

Author: Clote P.
Ferrè F.
Publication venue: Oxford University Press
Publication date: 01/01/2006

DiANNA is a recent state-of-the-art artificial neural network and web server, which determines the cysteine oxidation state and disulfide connectivity of a protein, given only its amino acid sequence. Version 1.0 of DiANNA uses a feed-forward neural network to determine which cysteines are involved in a disulfide bond, and employs a neural network with a novel architecture to predict which half-cystines are covalently bound to which other half-cystines. In version 1.1 of DiANNA, described here, we extend functionality by applying a support vector machine with spectrum kernel to the cysteine classification problem: to determine whether a cysteine is reduced (free in sulfhydryl state), half-cystine (involved in a disulfide bond) or bound to a metallic ligand. In the last case, DiANNA predicts the ligand among iron, zinc, cadmium and carbon. Available at:

Crossref

PubMed Central

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Boolean functions, invariance groups, and parallel complexity

Author: Clote P.
Kranakis E. (Evangelos)
Publication venue: Society for Industrial & Applied Mathematics (SIAM)
Publication date: 01/01/1991

CWI's Institutional Repository

Divergence and Shannon information in genomes

Author: C. E. Shannon
Chang-Heng Chang
H. M. Xie
H. O. Smith
Hong-Da Chen
Hoong-Chien Lee
L. L. Gatlin
Li-Ching Hsieh
P. Clote
Publication venue: American Physical Society (APS)
Publication date: 17/12/2004

Shannon information (SI) and its special case, divergence, are defined for a DNA sequence in terms of probabilities of chemical words in the sequence and are computed for a set of complete genomes highly diverse in length and composition. We find the following: SI (but not divergence) is inversely proportional to sequence length for a random sequence but is length-independent for genomes; the genomic SI is always greater and, for shorter words and longer sequences, hundreds to thousands of times greater than the SI in a random sequence whose length and composition match those of the genome; genomic SIs appear to have word-length dependent universal values. The universality is inferred to be an evolutionary footprint of a universal mode for genome growth. Comment: 4 pages, 3 tables, 2 figures

arXiv.org e-Print Archive

Crossref

Asymptotic structural properties of quasi-random saturated structures of RNA

Author: Clote P. (Peter)
Kranakis E. (Evangelos)
Krizanc D. (Danny)
Publication venue: Springer Science and Business Media LLC
Publication date: 25/10/2013

Background: RNA folding depends on the distribution of kinetic traps in the landscape of all secondary structures. Kinetic traps in the Nussinov energy model are precisely those secondary structures that are saturated, meaning that no base pair can be added without introducing either a pseudoknot or base triple. In previous work, we investigated asymptotic combinatorics of both random saturated structures and of quasi-random saturated structures, where the latter are constructed by a natural stochastic process. Results: We prove that for quasi-random saturated structures with the uniform distribution, the asymptotic expected number of external loops is O(log n) and the asymptotic expected maximum stem length is O(log n), while under the Zipf distribution, the asymptotic expected number of external loops is O(log^2 n) and the asymptotic expected maximum stem length is O(log n / log log n). Conclusions: Quasi-random saturated structures are generated by a stochastic greedy method, which is simple to implement. Structural features of random saturated structures appear to resemble those of quasi-random saturated structures, and the latter appear to constitute a class for which both the generation of sampled structures as well as a combinatorial investigation of structural features may be simpler to undertake.
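One plausible reading of the stochastic greedy method is sketched below; it is an illustration, not the authors' construction. It assumes a sequence-independent model in which any two positions may pair, a hypothetical minimum of `theta` unpaired bases in a hairpin loop, and uniform choice among all base pairs that can still be added. The names `crosses` and `quasi_random_saturated` are invented for this example.

```python
import random


def crosses(p, q):
    """True if base pairs p and q cross, i.e. would form a pseudoknot."""
    (i, j), (k, l) = p, q
    return i < k < j < l or k < i < l < j


def quasi_random_saturated(n, theta=1, rng=None):
    """Greedily add uniformly chosen admissible base pairs until none remain."""
    rng = rng or random.Random()
    pairs, paired = [], set()
    while True:
        # Admissible pairs: both ends unpaired (no base triple), at least
        # `theta` bases between them, and no crossing with existing pairs.
        candidates = [
            (i, j)
            for i in range(n)
            for j in range(i + theta + 1, n)
            if i not in paired and j not in paired
            and not any(crosses((i, j), p) for p in pairs)
        ]
        if not candidates:
            return sorted(pairs)  # saturated: nothing more can be added
        i, j = rng.choice(candidates)
        pairs.append((i, j))
        paired.update((i, j))


structure = quasi_random_saturated(20, theta=1, rng=random.Random(0))
```

The loop stops exactly when no base pair can be added, so the returned structure is saturated by construction. This naive version recomputes all candidates at each step and is meant for small n only.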

Carleton University's Institutional Repository

PubMed Central

Complexity Bounds for Ordinal-Based Termination

Author: A. Weiermann
B. Cook
C. Alias
C. Urban
D. Hofbauer
D.H.J. Jongh de
E.A. Cichoń
G. Bonfante
I. Lepper
J.P. Jouannaud
K. McAloon
M.H. Löb
N. Dershowitz
N. Hirokawa
P. Clote
S. Abriola
S. Gulwani
S. Schmitz
T. Colcombet
W. Buchholz
W. Bucholz
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2014

"What more than its truth do we know if we have a proof of a theorem in a given formal system?" We examine Kreisel's question in the particular context of program termination proofs, with an eye to deriving complexity bounds on program running times. Our main tools for this are length function theorems, which provide complexity bounds on the use of well quasi orders. We illustrate how to prove such theorems in the simple yet until now untreated case of ordinals. We show how to apply this new theorem to derive complexity bounds on programs when they are proven to terminate thanks to a ranking function into some ordinal. Comment: Invited talk at the 8th International Workshop on Reachability Problems (RP 2014, 22-24 September 2014, Oxford)

arXiv.org e-Print Archive

CiteSeerX

Crossref

INRIA a CCSD electronic archive server

Distribution of graph-distances in Boltzmann ensembles of RNA secondary structures

Author: A. Kobitski
A.M. Yoffe
A.P. Baraniak
A.X. Li
C.J. McManus
C.M. Reidys
D. Dufour
D. Leipply
D. Mathews
D.H. Mathews
H.S. Han
J.S. McCaskill
K. Darty
L.T. Fang
M. Müller
P. Clote
P. Schuster
R. Das
R. Giegerich
R. Lorenz
R. Roy
R.A. Forties
T.R. Einert
U. Gerland
Y. Ding
Publication venue
Publication date: 01/01/2013

Large RNA molecules often carry multiple functional domains whose spatial arrangement is an important determinant of their function. Pre-mRNA splicing, furthermore, relies on the spatial proximity of the splice junctions that can be separated by very long introns. Similar effects appear in the processing of RNA virus genomes. Albeit a crude measure, the distribution of spatial distances in thermodynamic equilibrium therefore provides useful information on the overall shape of the molecule and can provide insights into the interplay of its functional domains. Spatial distance can be approximated by the graph-distance in RNA secondary structure. We show here that the equilibrium distribution of graph-distances between arbitrary nucleotides can be computed in polynomial time by means of dynamic programming. A naive implementation would yield recursions with a very high time complexity of O(n^11). Although we were able to reduce this to O(n^6) for many practical applications, a further reduction seems difficult. We conclude, therefore, that sampling approaches, which are much easier to implement, are also theoretically favorable for most real-life applications, in particular since these primarily concern long-range interactions in very large RNA molecules. Comment: Peer-reviewed and presented as part of the 13th Workshop on Algorithms in Bioinformatics (WABI2013)
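To make the notion of graph-distance concrete, the sketch below computes it for a single secondary structure given in dot-bracket notation, treating backbone bonds and base pairs as edges of unit length (an assumption; other edge weights are conceivable). It does not compute the Boltzmann distribution discussed in the abstract. A sampling approach would apply such a routine to each sampled structure. The function names are hypothetical.

```python
from collections import deque


def pair_table(dot_bracket):
    """Map each paired position to its partner in a dot-bracket string."""
    stack, partner = [], {}
    for i, c in enumerate(dot_bracket):
        if c == "(":
            stack.append(i)
        elif c == ")":
            j = stack.pop()
            partner[i], partner[j] = j, i
    return partner


def graph_distance(dot_bracket, src, dst):
    """Shortest path from src to dst over backbone and base-pair edges (BFS)."""
    n = len(dot_bracket)
    partner = pair_table(dot_bracket)
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            return dist[u]
        neighbours = [u - 1, u + 1]          # backbone edges
        if u in partner:
            neighbours.append(partner[u])    # base-pair edge
        for v in neighbours:
            if 0 <= v < n and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return None  # only reached if dst is not a valid position


hairpin = "((...))"
end_to_end = graph_distance(hairpin, 0, 6)
```

In the hairpin example the two ends are six backbone steps apart but only one step apart in the graph, because they form a base pair.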

arXiv.org e-Print Archive

Crossref

Fraunhofer-ePrints

Thermodynamics of RNA structures by Wang–Landau sampling

Author: Abrashams
Bekaert
Bernhart
Bradley
B ck
Cheah
Chen
Clote
Danilova
Dimitrov
Dirks
Eddy
F. Lou
Flamm
Griffiths-Jones
Hofacker
Kirkpatrick
Knudsen
Kou
Lim
Lyngs
Mandal
Markham
Metzler
Nussinov
Omer
Ortiz
P. Clote
Reeder
Reinisch
REN
Rivas
Tabaska
Tucker
van Batenburg
Wang
Weinger
Wuchty
Xayaphoummine
Zhang
Zhao
Zuker
Publication venue: Oxford University Press
Publication date: 01/01/2010

Motivation: Thermodynamics-based dynamic programming RNA secondary structure algorithms have been of immense importance in molecular biology, where applications range from the detection of novel selenoproteins using expressed sequence tag (EST) data, to the determination of microRNA genes and their targets. Dynamic programming algorithms have been developed to compute the minimum free energy secondary structure and partition function of a given RNA sequence, the minimum free energy and partition function for the hybridization of two RNA molecules, etc. However, the applicability of dynamic programming methods depends on disallowing certain types of interactions (pseudoknots, zig-zags, etc.), as their inclusion renders structure prediction a nondeterministic polynomial time (NP)-complete problem. Nevertheless, such interactions have been observed in X-ray structures.

HAL-CentraleSupelec

Crossref

INRIA a CCSD electronic archive server

PubMed Central

HAL-Polytechnique

HAL-Rennes 1